Local TopN Index

Based on the local index, the local TopN index sorts the list of data record keys in each index entry by the specified field, and retains the TopN record keys. The local TopN index is also a local index in essence, so it can also be directly used as a local index.

The local TopN index supports sorting by multiple Value fields. The sorting type (ascending/descending) of each sort field can be different.

Multiple local TopN indexes can be created in a table.

1. Index Features

TopN indexes have the following two features:

  • TopN is also a local index. Like regular local indexes, TopN is a local index of data in a single data shard and is also defined in the table definition file;

  • The TopN index sorts the list of data record keys in the condition by the specified field, and only retains TopN data.

2. Application Scenarios

The TopN index is mainly used in scenarios similar to the ranking list (N <= 10000). Such as the scenario of ranking by points.

3. Instructions

3.1. Index Definition

The local TopN index is declared in the table definition file in a similar way to other local indexes. Similarly, the index name, index field and other attributes are specified in the index tag. The difference is that it needs to specify the sort field and sort order type, the number of row_keys to be retained, etc. in the customattr2 extension attribute of the table through the OrderBy and Limit attributes.

Take the ranking scenario as an example.

Table definition example:

Table definition description:

  • The rank_name field is a list name field. This field must be set as a subtable factor and local TopN index field to ensure that data of the same list is stored on the same node. At the same time, the access will also be based on the value of the field to the corresponding TopN index entries;

  • The uid and area_id fields are Key fields of the data. Together with rank_name, they form the Primary Key of the table.

  • The score1 field is the first sort field, and the data type is limited to integers and floating-point numbers;

  • The score2 field is the second sort field;

  • The uname field is the Value field of the first data. Multiple Value fields can be defined with unlimited data types;

  • The umail field is the Value field of the second data;

  • The index tag defines the local TopN index and specifies the index name and index field. It is the same as the regular local index definition, and the requirements are the same. If a local index is declared as a TopN index, it needs to specify key information such as the index's sort field, sort type, and list length in the customattr2 extended attribute of the table, as shown in the example.

  • IndexName: the index name to be declared as a TopN index, required;

  • OrderBy: list of sorted field names and types, required;

  • Limit: list length limit, value range is (0, 10000], required;

  • AutoDeleteDataRecord: true indicates that when data is "squeezed" out of all the lists, the database system will delete the data simultaneously. Otherwise, the data will be retained. Deleting the corresponding row_key in the index is optional, and the default value is false.

Special note:

Tables that contain local TopN index definitions have a built-in field (index) that stores the serial number of each data in the index. The limitations for this field are:

  1. This field does not need to be defined in the table definition, nor is it allowed to define a field with the same name in the table definition;

  2. This field can be queried directly, but cannot be updated directly.

3.2 Querying An Index

To hit the local TopN index, the query must strictly meet the following conditions:

  1. The query statement must contain a WHERE clause, and the filter conditions meet the requirements of the local index PartKey, that is, key1=value1 AND key2=value2 AND.

  2. The query statement must contain an ORDER BY clause, and the sort field order and sort type must be consistent with the sort field definition in the local TopN index definition. If the query results are returned in reverse order, it needs to nest another layer of queries in the outer layer, and the order of the sorting fields in the ORDRE BY of the outer query should be consistent with that in the local TopN index definition, but the order types are completely different. For details, see Example 5.5 in the next section.

  3. The query statement must contain a LIMIT clause, and the LIMIT value must be less than or equal to the LIMIT value in the local TopN index definition.

Examples

Suppose two local TopN indexes are defined in the table definition, as follows:

  • index1: OrderBy = score1: DESC; Limit = 1000

  • index2: OrderBy = score1: ASC, score2: DESC; Limit = 100

For different SQL statements, the hits of indexes are as follows:

SQL statement Index hits
SELECT uid, uname FROM rank_table Miss
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score1 ASC, score2 DESC LIMIT 10 index2
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score1 DESC, score2 DESC LIMIT 10 Miss
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score1 DESC Miss
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score1 DESC LIMIT 10 index1
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score1 DESC LIMIT 1001 Miss
SELECT uid, uname FROM rank_table WHERE rank_name = 'rank1' ORDER BY score2 DESC LIMIT 10 Miss

3.3. Typical Query Scenarios

3.3.1 Querying TopN data in the specified index entry

SELECT uid, uname FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC LIMIT 10;
SELECT uid, uname FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC, score2 ASC LIMIT 10;

Note: If LIMIT is greater than the actual number of row_keys in the index entry, the number of rows returned is the actual number of row_keys.

3.3.2 Querying the serial number of the specified data in the specified index entry

SELECT uid, __index__ FROM (SELECT * FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC LIMIT 10) WHERE uid=0;

Note: index is a built-in field. The index of each data in different TopN indexes may be different, and the value will change with the update of data. When the query hits a TopN index and the data is in the TopN index index Is a positive integer (counting from 0); Otherwise index The value of is fixed to - 1.

3.3.3 Querying the data in the specified range in the specified index entry

SELECT uid, score1, uname FROM (SELECT * FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC LIMIT 1000) WHERE 10 < score1 AND score1 < 20;
SELECT uid, score1, uname FROM (SELECT * FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC LIMIT 1000) WHERE 10 < __index__ AND __index__ < 20;

3.3.4 Querying the number of data in the specified index entry

SELECT count(1) FROM (SELECT * FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC, score2 ASC LIMIT 10);

Note: returns the number of actual data in the specified index entry, which is less than or equal to the maximum length limit of the index entry.

3.3.5 Querying the data in the specified index entry and return it in reverse order

SELECT uid, uname FROM (SELECT * FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC LIMIT 10) WHERE rank_name='rank1' ORDER BY score1 ASC;

Note: When the TopN index contains multiple sort fields and the whole result set is output in reverse order, the sorting types of all sort fields in the Order By outer query should be all related to the sorting types in the TopN index, and partial correlation is not allowed.

3.3.6 Traversing the data in the specified index entry

SELECT uid, score1, uname FROM rank_table WHERE rank_name='rank1' ORDER BY score1 DESC, score2 ASC LIMIT 1000;

4. Limitations on Use

  • It temporarily only supports the definition of local TopN indexes in the Generic tables of TDR.

  • Currently, it only supports accessing TopN indexes through MySQL Protocol Compatibility Interfaces.

  • Each primary key field in the table cannot exceed 32 bytes. After the table is created, the size of the primary key field cannot be modified.

  • The maximum value of the local TopN index 'N' is 10000.

  • The data types of sort fields are limited to integer and floating-point types.

  • The number of sort fields cannot exceed 8.

  • The empty position of the index cannot be automatically replaced after deleting the data in the local TopN index. For example, if the top 100 players are recorded in the index, the 101st player will not be automatically replaced after deleting the 50th player in the index.

  • After inserting new data into the local TopN index, the last data in the TopN index will be "squeezed out". At this time, the database system will not delete this "squeezed out" data synchronously by default, and the data can still be queried through other ways such as GET. If the data needs to be deleted synchronously, it needs to be specified that the AutoDeleteDataRecord attribute is true in the local TopN index definition.

5. Notations

\

results matching ""

    No results matching ""